Skip to content

Conversation

@KJ7LNW
Copy link
Contributor

@KJ7LNW KJ7LNW commented Apr 6, 2025

Context

This PR improves the XML output format in readFileTool.ts to make it more structured and less prone to misinterpretation by the model.

Implementation

  • Removed unnecessary XML indentation that could confuse the model
  • Separated file content from notices and errors using dedicated tags
  • Added line range information to content tags
  • Handled empty files properly with self-closing tags
  • Added comprehensive test coverage
  • Fixed carriage return handling in readLines()

Test Examples

Example 1: Definitions Only (maxReadFileLine=0)

<file><path>package.json</path><notice>Showing only 0 of 66 total lines. Use start_line and end_line if you need to read more</notice><list_code_definition_names>

# package.json
1--66 | {
20--45 |   "contributes": {
21--26 |     "commands": [
22--25 |       {
27--35 |     "viewsContainers": {
28--34 |       "activitybar": [
...
</list_code_definition_names></file>

Example 2: Content with Line Range

<file><path>package.json</path><content lines="1-66">
1 | {
 2 |   "name": "vsce-test-terminal-integration",
 3 |   "displayName": "Terminal Command Runner",
 4 |   "description": "Test VSCode terminal shell integration sequence race condition between parsing and AsyncIterable consumers",
 5 |   "version": "0.0.1",
 6 |   "engines": {
...
 </lst_code_definition_names></file>

Example 3: Empty File

<file><path>t</path><content/><notice>File is empty</notice></file>

Example 4: Error Handling

<file><path>src/nonexistent-file.ts</path><error>Error reading file: File not found: src/nonexistent-file.ts</error></file>
[read_file for 'src/nonexistent-file.ts'] Result:
The tool execution failed with the following error:
<error>
Error reading file: {"name":"Error","message":"File not found: nonexistent-file.ts",...}
</error>

How to Test

  • Run the existing tests
  • Try reading files with different line ranges and verify the XML output format

Get in Touch

Discord: KJ7LNW

Fixes #2278
cc: @hannesrudolph


Important

Refactor readFileTool.ts to improve XML output structure, add error handling, and enhance test coverage.

  • Behavior:
    • Refactor XML output in readFileTool.ts to improve structure and clarity.
    • Separate file content from notices and errors using dedicated tags.
    • Add line range information to content tags.
    • Handle empty files with self-closing tags.
    • Add error handling for invalid paths, start_line, end_line, and RooIgnore errors.
  • Tests:
    • Add comprehensive test coverage in read-file-maxReadFileLine.test.ts, read-file-xml.test.ts, and extract-text.test.ts.
    • Test cases for different maxReadFileLine values, binary files, and range parameters.
    • Validate XML structure, line range attributes, and error handling.
  • Functions:
    • Modify addLineNumbers() in extract-text.ts to handle trailing newlines and empty content correctly.
    • Update readLines() in read-lines.ts to process data chunks directly and handle edge cases.

This description was created by Ellipsis for 1c7d1b24f260be734f7e4c9c227346375348d8f1. It will automatically update as commits are pushed.

@changeset-bot
Copy link

changeset-bot bot commented Apr 6, 2025

⚠️ No Changeset found

Latest commit: 5413e47

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@hannesrudolph
Copy link
Collaborator

@mrubens

@hannesrudolph hannesrudolph moved this from New to PR [Pre Approval Review] in Roo Code Roadmap Apr 6, 2025
Eric Wheeler added 7 commits April 8, 2025 15:26
Empty files should not have line numbers, but non-empty files with empty content at a specific line offset should.
- If content is empty, return empty string for empty files
- If content is empty but startLine > 1, return line number for empty content at that offset
This ensures that the model does not think the file contains a single empty line.

Signed-off-by: Eric Wheeler <[email protected]>
- Remove unnecessary XML indentation that could confuse the model
- Separate file content from notices and errors using dedicated tags
- Add line range information to content tags
- Handle empty files properly with self-closing tags
- Add comprehensive test coverage

Fixes #2278

Signed-off-by: Eric Wheeler <[email protected]>
- Always display line numbers in non-range reads
- Improve XML formatting with consistent newlines for better readability

Signed-off-by: Eric Wheeler <[email protected]>
- Update test expectations to match the new XML format with newlines
- Update tests to expect line numbers attribute in content tags
- Modify test assertions to check for the correct line range values

Signed-off-by: Eric Wheeler <[email protected]>
- Add newline to all output
- Handle trailing newlines and empty lines consistently
- Add test cases for blank lines:
  - Multiple blank lines within content
  - Multiple trailing blank lines
  - Only blank lines with offset
  - Trailing newlines

Signed-off-by: Eric Wheeler <[email protected]>
- Modified extract-text mock to preserve actual addLineNumbers implementation
- Removed mock implementation of addLineNumbers
- Updated test data to account for trailing newline
- Removed unnecessary mock verification

Signed-off-by: Eric Wheeler <[email protected]>
- Replace direct mocking of addLineNumbers with spy on actual implementation
- Add verification to ensure the real function is called when appropriate
- Add skipAddLineNumbersCheck option for cases where function should not be called
- Update test cases to use appropriate verification options
- Fix numberedFileContent to include trailing newline for consistency

Signed-off-by: Eric Wheeler <[email protected]>
@KJ7LNW
Copy link
Contributor Author

KJ7LNW commented Apr 8, 2025

This turned into a bit of a wild goose chase when I discovered that content is not always being read correctly particularly with multiple carriage returns at the end of a file. This has been corrected as part of this pull request in 8c9caa2

I believe this is ready for review and merge.

@KJ7LNW KJ7LNW marked this pull request as ready for review April 8, 2025 22:29
@KJ7LNW KJ7LNW requested review from cte and mrubens as code owners April 8, 2025 22:29
@dosubot dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. enhancement New feature or request labels Apr 8, 2025
@KJ7LNW
Copy link
Contributor Author

KJ7LNW commented Apr 8, 2025

one more test to fix ...

- Direct data processing provides more accurate results by preserving exact content with carriage returns
- Improved performance through minimal buffering and efficient string operations
- Use string indexes to find newlines while maintaining their original format
- Handle all edge cases correctly with preserved line endings
- Add tests for various edge cases including empty files, single lines, and different line endings

Signed-off-by: Eric Wheeler <[email protected]>
@mrubens
Copy link
Collaborator

mrubens commented Apr 8, 2025

Great! Will look soon

Remove unused variable declaration to appease ellipsis-dev linter requirements.

Signed-off-by: Eric Wheeler <[email protected]>
@KJ7LNW
Copy link
Contributor Author

KJ7LNW commented Apr 8, 2025

all tests are passing, this is ready for review

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Apr 9, 2025
@mrubens mrubens merged commit 270fd88 into RooCodeInc:main Apr 9, 2025
12 checks passed
@github-project-automation github-project-automation bot moved this from PR [Pre Approval Review] to Done in Roo Code Roadmap Apr 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request lgtm This PR has been approved by a maintainer size:XXL This PR changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants